Doctor of Philosophy
This dissertation describes our work on the design, fabrication, and characterization of plasmonic metamaterials and tapered structures, with a primary focus on their applications at terahertz (THz) frequencies. The phenomena associated with these structures rely on surface plasmon polaritons (SPPs), which may allow for high field enhancement and tight field confinement. We investigated the underlying mechanisms of these structures and used that knowledge to develop unique and practical applications. We first studied two-dimensional periodic and random lattices based on aperture arrays, and modified the model to describe the effective dielectric response of the perforated metallic medium. Using two layers of perforated stainless steel film, we demonstrated the emergence of an additional resonance and reproduced the transmission spectra using the effective dielectric model of the single-layer medium. We also improved the filtering performance of multilayer periodic aperture arrays by adjusting the relative distance and angle between the layers, and demonstrated their application as a high-quality bandpass filter. We then examined the transmission properties of graphite and carbon nanotube (CNT) films, first unpatterned and then perforated with periodic aperture arrays. The extracted dielectric constants of the graphite and CNT films demonstrate their suitability for THz surface plasmonic devices. Moreover, we developed a narrowband/multiband THz detector in which the photoconductive antenna is surrounded by periodically corrugated gratings. This detector not only enhanced detection sensitivity at specific frequencies, but also efficiently collected the radiation within the structure area, which obviated the need for a substrate lens. Finally, we improved the concentration properties of conically tapered apertures.
Based on the optimal taper angle we determined, we introduced various modifications to the individual tapered aperture, e.g., forming an array and inserting a gap spacing, which further enhanced the concentration capability and realized complete broadband transmission. Based on these studies and results, we are currently extending our work towards the development of more reconfigurable and active devices that could enrich the available pool of THz and optical devices. Furthermore, such THz devices hold great promise for the development of THz system-level applications and even a THz-based world in the future.
Genomic integrative analysis to improve fusion transcript detection, liquid association and biclustering
More data provide more possibilities. The growing volume of genomic data provides new perspectives for understanding complex biological problems. Many single-study algorithms have been developed; however, their results are unstable for small sample sizes or are overwhelmed by study-specific signals. Taking advantage of high-throughput genomic data from multiple cohorts, in this dissertation we detect novel fusion transcripts, explore complex gene regulation, and discover disease subtypes within an integrative analysis framework.
In the first project, we evaluated 15 fusion transcript detection tools for paired-end RNA-seq data. Although no single method clearly outperformed the others, several top tools were selected according to their F-measures. We further developed a fusion meta-caller algorithm that combines the top methods to re-prioritize candidate fusion transcripts. The results showed that our meta-caller balances precision and recall better than any single fusion detection tool.
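As a rough illustration of the two ideas in this paragraph, the sketch below computes an F-measure from a set of predicted fusions and a truth set, and ranks candidates by how many tools report them. The vote-count ranking is an assumption chosen for simplicity, not the dissertation's meta-caller, and the function names `f_measure` and `rank_by_votes` are hypothetical.

```python
from collections import Counter


def f_measure(predicted, truth):
    """Harmonic mean of precision and recall for a set of predicted fusions."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


def rank_by_votes(calls_per_tool):
    """Rank candidate fusions by the number of tools that report them.

    Assumption: a plain vote count stands in for the meta-caller's scoring.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    votes = Counter()
    for calls in calls_per_tool.values():
        votes.update(set(calls))  # each tool votes at most once per fusion
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical tool names and fusion labels, for illustration only.
example_calls = {
    "tool_a": ["GENE1-GENE2", "GENE3-GENE4"],
    "tool_b": ["GENE1-GENE2"],
    "tool_c": ["GENE1-GENE2", "GENE3-GENE4"],
}
ranking = rank_by_votes(example_calls)
```

Candidates supported by more tools rise to the top of `ranking`, which is one simple way to trade a little recall for higher precision.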
In the second project, we extended liquid association to two meta-analytic frameworks (MetaLA and MetaMLA). Liquid association is the dynamic gene-gene correlation that depends on the expression level of a third gene. MetaLA and MetaMLA provided stronger detection signals and more consistent and stable results than single-study analysis. When we applied our method to five yeast datasets related to environmental changes, genes in the top triplets were highly enriched in fundamental biological processes corresponding to environmental changes.
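For a single study, liquid association between genes X and Y given a third gene Z is commonly estimated as the mean of the product X·Y·Z after each gene is standardized. The sketch below implements that single-study estimator only. It assumes Z is already approximately normal (the usual normal-score transform is omitted), and it does not reproduce MetaLA or MetaMLA. The name `liquid_association` and the simulated data are illustrative.

```python
import random
import statistics


def standardize(values):
    """Center to mean 0 and scale to (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def liquid_association(x, y, z):
    """Three-product-moment estimate: mean of x*y*z after standardization."""
    xs, ys, zs = standardize(x), standardize(y), standardize(z)
    return statistics.fmean([a * b * c for a, b, c in zip(xs, ys, zs)])


# Simulated data: the x-y correlation is positive when z > 0 and negative
# when z < 0, so the estimate should be clearly positive.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(5000)]
x = [rng.gauss(0, 1) for _ in range(5000)]
y_linked = [zi * xi for zi, xi in zip(z, x)]
y_independent = [rng.gauss(0, 1) for _ in range(5000)]

la_linked = liquid_association(x, y_linked, z)
la_independent = liquid_association(x, y_independent, z)
```

A value near zero (as for `la_independent`) indicates that the X-Y co-expression does not change with Z.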
In the third project, we extended the plaid model from single-study analysis to multiple cohorts for bicluster detection. Our meta-biclustering algorithm discovers biclusters with higher Jaccard accuracy under large noise and small sample sizes. We also introduced the gap statistic for estimating the pruning parameter. In addition, biclusters detected from five breast cancer mRNA expression cohorts select genes highly associated with many breast cancer related pathways and split samples into groups with significantly different survival behaviors.
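One plausible reading of "Jaccard accuracy" is the Jaccard index between the cells of a detected bicluster and those of the true one. The sketch below follows that reading. Treating a bicluster as the set of its (row, column) cells is an assumption, and the helper names are hypothetical.

```python
from itertools import product


def bicluster_cells(rows, cols):
    """Represent a bicluster as the set of (row, column) cells it covers."""
    return set(product(rows, cols))


def jaccard(cells_a, cells_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = cells_a | cells_b
    if not union:
        return 1.0  # two empty biclusters are treated as identical
    return len(cells_a & cells_b) / len(union)


true_bicluster = bicluster_cells(rows={0, 1}, cols={0, 1})
found_bicluster = bicluster_cells(rows={0, 1}, cols={1, 2})
score = jaccard(true_bicluster, found_bicluster)
```

Here the two biclusters share two of six distinct cells, so the score is one third. A score of 1 means exact recovery.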
In conclusion, we improved fusion transcript detection, liquid association analysis, and bicluster discovery through integrative analysis frameworks. These results provide strong evidence of gene fusion structural variation, three-way gene regulation, and disease subtypes, and thus ultimately contribute to a better understanding of complex disease mechanisms.
Nonlinear resonances of electrostatically actuated nano-beam
The nonlinear response of an electrostatically actuated nano-beam near half its natural frequency is studied by considering the nonlinearities of large deformation, the electrostatic force, and the Casimir effect. A first-order fringe correction of the electrostatic force, large deformation, viscous damping, and the Casimir effect are included in the dynamic model. The dynamics of the resonator are investigated by applying the method of multiple scales directly to the problem. Sufficient conditions for system stability and for a saddle-node bifurcation are studied. The influences of large deformation, damping, actuation, and the fringe effect on the resonator response are studied. The peak amplitude of the primary resonance is given in the paper. Numerical simulations are conducted for a uniform nano-beam.
UniDistill: A Universal Cross-Modality Knowledge Distillation Framework for 3D Object Detection in Bird's-Eye View
In the field of 3D object detection for autonomous driving, the sensor
portfolio including multi-modality and single-modality is diverse and complex.
Since the multi-modal methods have system complexity while the accuracy of
single-modal ones is relatively low, how to make a tradeoff between them is
difficult. In this work, we propose a universal cross-modality knowledge
distillation framework (UniDistill) to improve the performance of
single-modality detectors. Specifically, during training, UniDistill projects
the features of both the teacher and the student detector into Bird's-Eye-View
(BEV), which is a friendly representation for different modalities. Then, three
distillation losses are calculated to sparsely align the foreground features,
helping the student learn from the teacher without introducing additional cost
during inference. Taking advantage of the similar detection paradigm of
different detectors in BEV, UniDistill easily supports LiDAR-to-camera,
camera-to-LiDAR, fusion-to-LiDAR and fusion-to-camera distillation paths.
Furthermore, the three distillation losses can filter the effect of misaligned
background information and balance between objects of different sizes,
improving the distillation effectiveness. Extensive experiments on nuScenes
demonstrate that UniDistill effectively improves the mAP and NDS of student
detectors by 2.0%~3.2%.
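The idea of aligning only foreground features can be sketched as a masked mean-squared error between teacher and student BEV feature maps. This is a minimal illustration under stated assumptions: it uses a single-channel grid and a binary foreground mask, and it does not reproduce any of the three UniDistill losses. The name `masked_feature_loss` is hypothetical.

```python
def masked_feature_loss(teacher, student, foreground):
    """Mean squared error over the BEV cells marked as foreground.

    teacher, student: 2D lists of floats with the same shape.
    foreground: 2D list of 0/1 flags; background cells are ignored.
    """
    total, count = 0.0, 0
    for t_row, s_row, m_row in zip(teacher, student, foreground):
        for t, s, m in zip(t_row, s_row, m_row):
            if m:
                total += (t - s) ** 2
                count += 1
    return total / count if count else 0.0


teacher_bev = [[1.0, 2.0], [3.0, 4.0]]
student_bev = [[1.0, 0.0], [0.0, 4.0]]
object_mask = [[0, 1], [0, 0]]  # only the top-right cell contains an object
loss = masked_feature_loss(teacher_bev, student_bev, object_mask)
```

Because the loss is computed only during training, a scheme of this kind adds nothing to the student's inference cost, which matches the claim in the abstract.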
VIP5: Towards Multimodal Foundation Models for Recommendation
Computer Vision (CV), Natural Language Processing (NLP), and Recommender
Systems (RecSys) are three prominent AI applications that have traditionally
developed independently, resulting in disparate modeling and engineering
methodologies. This has impeded the ability of these fields to directly
benefit from each other's advancements. With the recent development of
foundation models, large language models have emerged as a potential
general-purpose interface for unifying different modalities and problem
formulations. In light of this, we propose the development of a multimodal
foundation model (MFM) considering visual, textual, and personalization
modalities under the P5 recommendation paradigm, thus named VIP5 (Visual P5),
to unify various modalities and recommendation tasks. This will enable the
processing of multiple modalities in a shared architecture for improved
recommendations. To achieve this, we introduce multimodal personalized prompts
to accommodate multiple modalities under a shared format. Additionally, we
propose a parameter-efficient training method for foundation models, which
involves freezing the P5 backbone and fine-tuning lightweight adapters,
resulting in improved recommendation performance and increased efficiency in
terms of training time and memory usage. Code and data of VIP5 are available at
https://github.com/jeykigung/VIP5.
Comment: Accepted by EMNLP 202
Learning Personalized Risk Preferences for Recommendation
The rapid growth of e-commerce has made people accustomed to shopping online.
Before making purchases on e-commerce websites, most consumers tend to rely on
rating scores and review information to make purchase decisions. With this
information, they can infer the quality of products to reduce the risk of
purchase. Specifically, items with high rating scores and good reviews tend to
be less risky, while items with low rating scores and bad reviews might be
risky to purchase. On the other hand, the purchase behaviors will also be
influenced by consumers' tolerance of risks, known as the risk attitudes.
Economists have studied risk attitudes for decades. These studies reveal that
people are not always rational enough when making decisions, and their risk
attitudes may vary in different circumstances.
Most existing work on recommender systems does not model users' risk
attitudes, which may lead to inappropriate recommendations. For example,
suggesting a risky item to a risk-averse person or a conservative item to a
risk-seeking person may degrade the user experience. In this paper, we propose
a novel risk-aware recommendation
framework that integrates machine learning and behavioral economics to uncover
the risk mechanism behind users' purchasing behaviors. Concretely, we first
develop statistical methods to estimate the risk distribution of each item and
then incorporate the Nobel Prize-winning Prospect Theory into our model to learn how
users choose from probabilistic alternatives that involve risks, where the
probabilities of the outcomes are uncertain. Experiments on several e-commerce
datasets demonstrate that our approach can achieve better performance than many
classical recommendation approaches, and further analyses also verify the
advantages of risk-aware recommendation beyond accuracy.
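For readers unfamiliar with Prospect Theory, the sketch below shows its two standard ingredients: a value function that is steeper for losses than for gains, and a probability weighting function that overweights small probabilities. The functional forms and default parameters are the commonly cited Tversky and Kahneman (1992) estimates. The paper's own parameterization is not given in this abstract, so treat the defaults, the simplified separable scoring, and the function names as assumptions.

```python
def value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Subjective value of a gain or loss x relative to a reference point of 0."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * (-x) ** beta


def weight(p, gamma=0.61):
    """Inverse-S probability weighting: small p is overweighted."""
    numerator = p ** gamma
    return numerator / (numerator + (1 - p) ** gamma) ** (1 / gamma)


def prospect_value(outcomes):
    """Score a prospect given as (outcome, probability) pairs.

    Simplification: each outcome is weighted separately, rather than with
    the rank-dependent weights of cumulative prospect theory.
    """
    return sum(weight(p) * value(x) for x, p in outcomes)


# A fair coin flip to win or lose 100 is scored as unattractive,
# because the loss looms larger than the equal-sized gain.
coin_flip = prospect_value([(100, 0.5), (-100, 0.5)])
```

In a recommender setting, the outcomes could be derived from an item's rating distribution, with user-specific parameters learned from purchase histories. That mapping is only a suggestion here, not the paper's stated design.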
KuaiSim: A Comprehensive Simulator for Recommender Systems
Reinforcement Learning (RL)-based recommender systems (RSs) have garnered
considerable attention due to their ability to learn optimal recommendation
policies and maximize long-term user rewards. However, deploying RL models
directly in online environments and generating authentic data through A/B tests
can pose challenges and require substantial resources. Simulators offer an
alternative approach by providing training and evaluation environments for RS
models, reducing reliance on real-world data. Existing simulators have shown
promising results but also have limitations such as simplified user feedback,
a lack of consistency with real-world data, the challenge of simulator
evaluation, and difficulties in migration and expansion across RSs. To address
these challenges, we propose KuaiSim, a comprehensive user environment that
provides user feedback with multi-behavior and cross-session responses. The
resulting simulator can support three levels of recommendation problems: the
request level list-wise recommendation task, the whole-session level sequential
recommendation task, and the cross-session level retention optimization task.
For each task, KuaiSim also provides evaluation protocols and baseline
recommendation algorithms that further serve as benchmarks for future research.
We also restructure existing competitive simulators on the KuaiRand Dataset and
compare them against KuaiSim to further assess their performance and behavioral
differences. Furthermore, to showcase KuaiSim's flexibility in accommodating
different datasets, we demonstrate its versatility and robustness when
deploying it on the ML-1m dataset.
A Dataset And Benchmark Of Underwater Object Detection For Robot Picking
Underwater object detection for robot picking has attracted a lot of
interest. However, it is still an unsolved problem due to several challenges.
We take steps towards making it more realistic by addressing the following
challenges. Firstly, the currently available datasets largely lack test set
annotations, so researchers must compare their methods with other
state-of-the-art (SOTA) methods on a self-divided test set (taken from the
training set). Retraining the other methods increases the workload, and
different researchers split the datasets differently, so there is no unified
benchmark for comparing the performance of different algorithms. Secondly,
these datasets also have other shortcomings, e.g., too many similar images or
incomplete labels. To address these challenges, we introduce
a dataset, Detecting Underwater Objects (DUO), and a corresponding benchmark,
based on the collection and re-annotation of all relevant datasets. DUO
contains a collection of diverse underwater images with more rational
annotations. The corresponding benchmark provides indicators of both efficiency
and accuracy of SOTA methods (under the MMDetection framework) for academic
research and industrial applications, where JETSON AGX XAVIER is used to assess
detector speed to simulate the robot-embedded environment.